HeidelPlace: An Extensible Framework for Geoparsing
Authors
Abstract
Problem: Several geoparsers are available, each with its own gazetteer and its own approaches to toponym recognition and resolution. However, they often lack extensibility, their implementations are not accessible, or they are tied to a particular gazetteer. This makes it difficult to adapt them to other application domains and to set up experiments.

HeidelPlace provides:
• A generic gazetteer model that supports the integration of place information from heterogeneous knowledge bases
• A pipeline approach that enables the implementation and combination of modules for specific geoparsing applications
• GUIs for browsing the gazetteer and testing developed modules

This makes HeidelPlace a unique and valuable tool for experimenting with new geoparsing approaches.
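
The combination of a generic gazetteer model and a module pipeline can be illustrated with a minimal Python sketch. This is not HeidelPlace's actual API: the names `Place`, `Gazetteer`, `Document`, `make_recognizer`, `make_resolver`, and `run_pipeline` are hypothetical, the recognition and resolution strategies (exact single-token lookup, largest population wins) are deliberately naive stand-ins, and the coordinates and population figures are rough illustrative values rather than authoritative data.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the text


@dataclass
class Place:
    """One gazetteer entry; `source` names the knowledge base it came from."""
    name: str
    lat: float
    lon: float
    population: int
    source: str


class Gazetteer:
    """Hypothetical source-agnostic store: entries from any knowledge base share one model."""

    def __init__(self) -> None:
        self._by_name: Dict[str, List[Place]] = {}

    def add(self, place: Place) -> None:
        self._by_name.setdefault(place.name.lower(), []).append(place)

    def lookup(self, name: str) -> List[Place]:
        return list(self._by_name.get(name.lower(), []))


@dataclass
class Document:
    text: str
    toponyms: List[Span] = field(default_factory=list)
    resolved: Dict[Span, Place] = field(default_factory=dict)


# A pipeline module is any callable that takes a Document and returns it annotated.
Module = Callable[[Document], Document]


def make_recognizer(gazetteer: Gazetteer) -> Module:
    """Toponym recognition stand-in: mark every single token found in the gazetteer."""
    def recognize(doc: Document) -> Document:
        for match in re.finditer(r"[A-Za-z]+", doc.text):
            if gazetteer.lookup(match.group()):
                doc.toponyms.append((match.start(), match.end()))
        return doc
    return recognize


def make_resolver(gazetteer: Gazetteer) -> Module:
    """Toponym resolution stand-in: pick the candidate with the largest population."""
    def resolve(doc: Document) -> Document:
        for start, end in doc.toponyms:
            candidates = gazetteer.lookup(doc.text[start:end])
            if candidates:
                doc.resolved[(start, end)] = max(candidates, key=lambda p: p.population)
        return doc
    return resolve


def run_pipeline(doc: Document, modules: List[Module]) -> Document:
    """Apply the modules in order; swapping a module changes the experiment."""
    for module in modules:
        doc = module(doc)
    return doc


# Toy entries from two differently named sources (illustrative values only).
gazetteer = Gazetteer()
gazetteer.add(Place("Heidelberg", 49.41, 8.69, 160_000, "geonames"))
gazetteer.add(Place("Heidelberg", -26.50, 28.36, 35_000, "wikidata"))
gazetteer.add(Place("Paris", 48.86, 2.35, 2_100_000, "geonames"))

result = run_pipeline(
    Document("She moved from Heidelberg to Paris."),
    [make_recognizer(gazetteer), make_resolver(gazetteer)],
)
for (start, end), place in result.resolved.items():
    print(result.text[start:end], "->", place.lat, place.lon, place.source)
```

Because each stage is just a function over a shared `Document`, a different recognizer or resolver can be substituted without touching the gazetteer or the other stages, which is the kind of experimental flexibility the abstract describes.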
Similar resources
GeoCorpora: building a corpus to test and train microblog geoparsers
In this article, we present the GeoCorpora corpus building framework and software tools as well as a geo-annotated Twitter corpus built with these tools to foster research and development in the areas of microblog/Twitter geoparsing and geographic information retrieval. The developed framework employs crowdsourcing and geovisual analytics to support the construction of large corpora of text in ...
Multi-lingual Geoparsing based on Machine Translation
Our method for multi-lingual geoparsing uses monolingual tools and resources along with machine translation and alignment to return location words in many languages. Not only does our method save the time and cost of developing geoparsers for each language separately, but also it allows the possibility of a wide range of language capabilities within a single interface. We evaluated our method i...
Geographic Information Retrieval and Visualization of Online Unstructured Documents
Newspapers, travel narratives, blogs, books and the Internet hold a huge amount of geographic information that can be extracted in order to provide visual exploration. Also, the understanding of place references involves knowledge of the document context. In this way, the study of tools for disambiguation is needed. For the automatic annotation of time and location, both shared world knowledge ...
Geo information extraction and processing from travel narratives
Travel narratives published in electronic formats can be very important especially to the tourism community because of the great amount of knowledge that can be extracted. However, the low exploitation of these documents opens a new area of opportunity to the computing community. In this way, this article explores new ways to visualize travel narratives in a map in order to take advantage of ex...
A Framework for Building Extensible C++ Class Libraries
Extensibility leads to better designed and more reusable software. Traditionally, implementors have built extensible C++ software using ad hoc mechanisms built from scratch. This paper identifies specific characteristics that constitute extensible software. A framework for building extensible C++ libraries has been defined and constructed on AIX 3.2. Finally, the paper gives guidelines for impl...